A Focused Study on Sequence Length for Dialogue Summarization
Output length is critical to dialogue summarization systems. The dialogue
summary length is determined by multiple factors, including dialogue
complexity, summary objective, and personal preferences. In this work, we
approach dialogue summary length from three perspectives. First, we analyze the
length differences between existing models' outputs and the corresponding human
references and find that summarization models tend to produce more verbose
summaries due to their pretraining objectives. Second, we identify salient
features for summary length prediction by comparing different model settings.
Third, we experiment with a length-aware summarizer and show notable
improvement on existing models if summary length can be well incorporated.
Analysis and experiments are conducted on the popular DialogSum and SAMSum
datasets to validate our findings. Comment: Preprint version - ICASSP submission
MeaeQ: Mount Model Extraction Attacks with Efficient Queries
We study model extraction attacks in natural language processing (NLP) where
attackers aim to steal victim models by repeatedly querying the open
Application Programming Interfaces (APIs). Recent works focus on limited-query
budget settings and adopt random sampling or active learning-based sampling
strategies on publicly available, unannotated data sources. However, these
methods often result in selected queries that lack task relevance and data
diversity, leading to limited success in achieving satisfactory results with
low query costs. In this paper, we propose MeaeQ (Model extraction attack with
efficient Queries), a straightforward yet effective method to address these
issues. Specifically, we first use a zero-shot sequence inference
classifier, combined with API service information, to filter task-relevant data
from a public text corpus instead of a problem domain-specific dataset.
Furthermore, we employ a clustering-based data reduction technique to obtain
representative data as queries for the attack. Extensive experiments conducted
on four benchmark datasets demonstrate that MeaeQ achieves higher functional
similarity to the victim model than baselines while requiring fewer queries.
Our code is available at https://github.com/C-W-D/MeaeQ. Comment: Accepted by EMNLP 2023 main conference
An Overview on Language Models: Recent Developments and Outlook
Language modeling studies the probability distributions over strings of
texts. It is one of the most fundamental tasks in natural language processing
(NLP). It has been widely used in text generation, speech recognition, machine
translation, etc. Conventional language models (CLMs) aim to predict the
probability of linguistic sequences in a causal manner. In contrast,
pre-trained language models (PLMs) cover broader concepts and can be used in
both causal sequential modeling and fine-tuning for downstream applications.
PLMs have their own training paradigms (usually self-supervised) and serve as
foundation models in modern NLP systems. This overview paper provides an
introduction to both CLMs and PLMs from five aspects, i.e., linguistic units,
structures, training methods, evaluation methods, and applications.
Furthermore, we discuss the relationship between CLMs and PLMs and shed light
on the future directions of language modeling in the pre-trained era.
Mixture theory-based SPH model for submerged landslide
A novel SPH model for solving coupled water-soil problems is proposed based on mixture theory. The method features spatially overlapping dual continua for the fluid and solid phases. The water phase is modeled as a weakly compressible Newtonian fluid, and the soil phase is modeled using an elastoplastic constitutive model. A benchmark problem, fully submerged soil subjected to gravity, is examined to validate the SPH model. Finally, a submerged landslide is simulated to demonstrate the capability of the proposed SPH model in solving dynamic soil–water coupling problems.
Carbon spheres as lubricant additives for improving tribological performance of polyetheretherketone
Simple One-Pot Synthesis of Hexagonal ZnO Nanoplates as Anode Material for Lithium-Ion Batteries
Hexagonal ZnO nanoplates were synthesized via a simple one-pot hydrothermal reaction of Zn(CH3COO)2 and CO(NH2)2. XRD, SEM, and HRTEM were used to investigate the composition and microstructure of the material. Together with the facile strain relaxation during structure and volume change upon cycling, this plate-like structure of ZnO is favorable for physical and chemical interactions with lithium ions because of its large contact area with the electrolyte, providing more active sites and short diffusion distances. The resulting hexagonal ZnO nanoplate electrode exhibited good cyclability and delivered a reversible discharge capacity of 368 mAh g−1 after 100 cycles at 0.1 C.
Bias and Fairness in Chatbots: An Overview
Chatbots have been studied for more than half a century. With the rapid
development of natural language processing (NLP) technologies in recent years,
chatbots using large language models (LLMs) have received much attention.
Compared with traditional ones, modern chatbots are more powerful and
have been used in real-world applications. There are, however, bias and fairness
concerns in modern chatbot design. Due to the huge amounts of training data,
extremely large model sizes, and lack of interpretability, bias mitigation and
fairness preservation of modern chatbots are challenging. Thus, a comprehensive
overview on bias and fairness in chatbot systems is given in this paper. The
history of chatbots and their categories are first reviewed. Then, bias sources
and potential harms in applications are analyzed. Considerations in designing
fair and unbiased chatbot systems are examined. Finally, future research
directions are discussed.
Towards Understanding Third-party Library Dependency in C/C++ Ecosystem
Third-party libraries (TPLs) are frequently reused in software to reduce
development cost and the time to market. However, external library dependencies
may introduce vulnerabilities into host applications. The issue of library
dependency has received considerable critical attention. Many package managers,
such as Maven, Pip, and NPM, have been proposed to manage TPLs. Moreover, a
significant amount of effort has been put into studying dependencies in
language ecosystems such as Java, Python, and JavaScript, but not C/C++. Due to the
lack of a unified package manager for C/C++, existing research has only limited
understanding of TPL dependencies in the C/C++ ecosystem, especially at large
scale.
Towards understanding TPL dependencies in the C/C++ ecosystem, we collect
existing TPL databases, package management tools, and dependency detection
tools, summarize the dependency patterns of C/C++ projects, and construct a
comprehensive and precise C/C++ dependency detector. Using our detector, we
extract dependencies from a large-scale database containing 24K C/C++
repositories from GitHub. Based on the extracted dependencies, we provide the
results and findings of an empirical study, which aims at understanding the
characteristics of the TPL dependencies. We further discuss the implications for
managing dependencies in C/C++ and the future research directions for software
engineering researchers and developers in the fields of library development,
software composition analysis, and C/C++ package managers. Comment: ASE 202
Detect Depression from Social Networks with Sentiment Knowledge Sharing
Social networks play an important role in propagating people's viewpoints,
emotions, thoughts, and fears. Notably, following lockdown periods during the
COVID-19 pandemic, the issue of depression has garnered increasing attention,
with a significant portion of individuals resorting to social networks as an
outlet for expressing emotions. Using deep learning techniques to discern
potential signs of depression from social network messages facilitates the
early identification of mental health conditions. Current efforts in detecting
depression through social networks typically rely solely on analyzing the
textual content, overlooking other potential information. In this work, we
conduct a thorough investigation that unveils a strong correlation between
depression and negative emotional states. The integration of such associations
as external knowledge can provide valuable insights for detecting depression.
Accordingly, we propose a multi-task training framework, DeSK, which utilizes
shared sentiment knowledge to enhance the efficacy of depression detection.
Experiments conducted on both Chinese and English datasets demonstrate the
cross-lingual effectiveness of DeSK.